Abstract

We develop a framework for approximation limits of polynomial- 
size linear programs from lower bounds on the nonnegative ranks 
of suitably defined matrices. This framework yields unconditional 
impossibility results that are applicable to any linear program as opposed
to only programs generated by hierarchies. Using our framework, we prove
that $O(n^{1/2-\varepsilon})$-approximations for CLIQUE require linear
programs of size $2^{n^{\Omega(\varepsilon)}}$. (This lower bound applies to
linear programs using a certain encoding of CLIQUE as a linear optimization
problem.) Moreover, we establish a similar result for approximations 
of semidefinite programs by linear programs. 

Our main ingredient is a quantitative improvement of Razborov's 
rectangle corruption lemma (1992) for the high error regime, which 
gives strong lower bounds on the nonnegative rank of certain pertur- 
bations of the unique disjointness matrix. 



1 Introduction 

1.1 Context 

Linear programs (LPs) play a central role in the design of approximation
algorithms, see, e.g., Vazirani [2001], Williamson and Shmoys [2011].
Therefore, understanding the limitations of LPs as tools for designing
approximation algorithms is an important question.

The first generation of results studied the limitations of specific LPs by
seeking to determine their integrality gaps. The second generation of
results, pioneered by Arora et al. [2002],^1 studied the limitations of LPs
generated by lift-and-project procedures or hierarchies (e.g., Sherali and
Adams [1990] and Lovász and Schrijver [1991]). See the previous work section
below for a more detailed account of the relevant literature.

In this work, we develop a framework for a third generation of results 
that apply to any LP for a given problem. For example, our lower bounds 
address the following question: Are there linear programming relaxations 
$\mathrm{LP}_n$ for CLIQUE of size poly(n) that achieve $O(1)$-approximations for all
graphs with at most n vertices? (In this sense, we prove lower bounds in a
model for non-uniform computation, whereas hierarchy lower bounds ap- 
ply to models for uniform computation.) 

Although we mainly focus on LPs, our framework readily generalizes 
to semidefinite programs (SDPs). 



Linear Encodings We study combinatorial optimization problems^2 that
can be encoded in a linear fashion by specifying a set of feasible solutions
represented as binary vectors and a set of admissible (linear) objective
functions represented by their coefficient vectors. An instance of a given
linear encoding is specified by a dimension d and admissible objective
function $w \in \mathbb{R}^d$. Solving the instance means finding a feasible
solution $x \in \{0,1\}^d$ such that $w^T x = \sum_{i=1}^d w_i x_i$ is minimum
(or maximum). The optimum value of the instance is thus the minimum (or
maximum) value of $w^T x$ for a feasible $x \in \{0,1\}^d$.

We require that every instance of the problem can be mapped to an instance
of the linear encoding in such a way that feasible solutions to an instance
of the problem can be converted in polynomial time to feasible solutions to
the corresponding instance of the linear encoding without deteriorating
their objective function values, and vice versa. In this case, we say that
a linear encoding faithfully encodes the problem. For graph problems such
as the maximum clique problem (CLIQUE), such a linear encoding does not
allow the set of feasible solutions to depend on the input graph, which is
encoded solely in the objective function.

1 We remark that Arora et al. [2002] also considered relaxations that are
not necessarily generated by specific lift-and-project procedures, but are
otherwise restricted. However, these relaxations turn out to be captured by
hierarchies.

2 We assume some familiarity with combinatorial optimization. See, e.g.,
Schrijver [2003].

For example, with the natural linear encoding of the metric traveling
salesman problem (metric TSP) the feasible solutions are the incidence
vectors of tours of the complete graph over $[n] := \{1, 2, \ldots, n\}$ for
some $n \ge 3$, and the admissible objective functions are all nonnegative
vectors $w = (w_{ij})$ such that $w_{ik} \le w_{ij} + w_{jk}$ for all i, j and k
in [n]. All vectors are encoded in $\mathbb{R}^d$, where $d = \binom{n}{2}$.
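
To make this encoding concrete, here is a minimal Python sketch for small n. It is only an illustration under our own conventions: the helper names (`edge_index`, `tour_incidence_vectors`, `is_admissible`) and the ordering of the $\binom{n}{2}$ coordinates are assumptions made for this example, not part of the encoding itself.

```python
from itertools import combinations, permutations


def edge_index(n):
    """Assign one coordinate to each edge {i, j} of the complete graph on [n]."""
    return {frozenset(e): k
            for k, e in enumerate(combinations(range(1, n + 1), 2))}


def tour_incidence_vectors(n):
    """Return the set of 0/1 incidence vectors of all tours on [n], n >= 3."""
    idx = edge_index(n)
    vectors = set()
    # Fixing vertex 1 as the start avoids enumerating rotations of a tour;
    # the set removes the remaining duplicate (the reversed tour).
    for rest in permutations(range(2, n + 1)):
        tour = (1,) + rest
        x = [0] * len(idx)
        for u, v in zip(tour, tour[1:] + tour[:1]):
            x[idx[frozenset((u, v))]] = 1
        vectors.add(tuple(x))
    return vectors


def is_admissible(w, n):
    """Check that w is nonnegative and satisfies the triangle inequality."""
    idx = edge_index(n)

    def cost(i, j):
        return w[idx[frozenset((i, j))]]

    if any(c < 0 for c in w):
        return False
    return all(cost(i, k) <= cost(i, j) + cost(j, k)
               for i, j, k in permutations(range(1, n + 1), 3))


tours = tour_incidence_vectors(5)
```

For n = 5 the vectors live in dimension 10 and every tour uses exactly five edges. Note that the feasible set depends only on n, while the instance enters only through w.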

Coming back to the general case, a linear encoding determines two
nested convex sets $P \subseteq Q$ in $\mathbb{R}^d$ for each d. The set P is
the convex hull of the feasible solutions of dimension d (thus P is a
0/1-polytope, see Appendix A for background on polytopes and polyhedra) and
Q is defined by all inequalities of the form $w^T x \ge \delta$ or
$w^T x \le \delta$ (for minimization and maximization problems, respectively)
satisfied by P where w is an admissible objective function of dimension d.



(Approximate) Extended Formulations Returning to our previous example,
it is known that the Held-Karp relaxation K of the metric TSP has
integrality gap at most 3/2 (see Held and Karp [1970], Wolsey [1980]). In
geometric terms, this means that $P \subseteq K \subseteq \frac{2}{3} Q$.
Although K is defined by an exponential number of inequalities, it is known
that it can be reformulated with a polynomial number of constraints by
adding a polynomial number of variables. That is, the Held-Karp relaxation K
has a polynomial-size extended formulation.

Formally, an extended formulation (EF) of a polytope $K \subseteq \mathbb{R}^d$
is a linear system in variables $(x, y) \in \mathbb{R}^{d+k}$ such that, for
every $x \in \mathbb{R}^d$, we have $x \in K$ if and only if there exists
$y \in \mathbb{R}^k$ such that (x, y) is a solution to the system. The size of
an EF is the number of inequalities in the system. Notice that an EF can
always be brought into slack form $Ex + Fy = g$, $y \ge 0$ without increasing
its size. We will mainly consider EFs in slack form. (For these, the size
equals the number of extra variables.)

The extension complexity xc(K) of the polytope K is defined as the min- 
imum size of an EF of K. Most of the LP relaxations that appear in the 
context of approximation algorithms actually have polynomial extension 
complexity. This is in particular the case of the relaxations obtained from
an initial polynomial size relaxation at a bounded level of any of the com- 
mon hierarchies. 

Let $\rho \ge 1$. Then we say that $Ex + Fy = g$, $y \ge 0$ is a
$\rho$-approximate EF of a given maximization problem, w.r.t. a given linear
encoding of this problem, if the maximum value of $w^T x$ on
$K := \{x \in \mathbb{R}^d \mid \exists y : Ex + Fy = g,\ y \ge 0\}$ is at least
the optimum value for every $w \in \mathbb{R}^d$ and at most $\rho$ times the
optimum value for every admissible $w \in \mathbb{R}^d$. Geometrically, this is
equivalent to $P \subseteq K \subseteq \rho Q$. For minimization problems, the
definitions are similar with $\rho$ replaced by $\rho^{-1}$. In this case, we
have $P \subseteq K \subseteq \rho^{-1} Q$.

Nonnegative Factorizations A rank-r nonnegative factorization of an
$m \times n$ matrix M is a decomposition of M as a product $M = TU$ of
nonnegative matrices T and U of size $m \times r$ and $r \times n$,
respectively. The nonnegative rank $\mathrm{rank}_+(M)$ of M is the minimum
rank r of nonnegative factorizations of M. It is quite useful to notice that
the nonnegative rank of M is also the minimum number of nonnegative rank-1
matrices whose sum is M. From this, we see immediately that the nonnegative
rank of M is at least the nonnegative rank of any of its submatrices.
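
The equivalence between a rank-r nonnegative factorization and a sum of r nonnegative rank-1 matrices can be checked on a toy example. The sketch below is illustrative only: the matrices T and U and the helper names are chosen by us, and the example says nothing about whether r = 2 is the minimum for this M.

```python
def matmul(T, U):
    """Plain matrix product of T (m x r) and U (r x n), given as nested lists."""
    return [[sum(T[i][k] * U[k][j] for k in range(len(U)))
             for j in range(len(U[0]))]
            for i in range(len(T))]


def rank_one_terms(T, U):
    """Split M = T U into r rank-1 matrices: column k of T times row k of U."""
    return [[[T[i][k] * U[k][j] for j in range(len(U[0]))]
             for i in range(len(T))]
            for k in range(len(U))]


def add_matrices(mats):
    """Entrywise sum of a nonempty list of equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) for j in range(cols)]
            for i in range(rows)]


# A hypothetical rank-2 nonnegative factorization of a 3 x 3 matrix.
T = [[1, 0], [0, 1], [1, 1]]
U = [[1, 2, 0], [0, 1, 3]]
M = matmul(T, U)
terms = rank_one_terms(T, U)
```

Each term is nonnegative because T and U are, and summing the terms recovers M entry by entry.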

The factorization theorem of Yannakakis [1991] (see Yannakakis [1988]
for the conference version) states that the extension complexity of a
polytope K is precisely the nonnegative rank of any of its slack matrices.
If K is the convex hull of $\{v_1, \ldots, v_n\} \subseteq \mathbb{R}^d$ and the
set of solutions to $A_1 x \le b_1, \ldots, A_m x \le b_m$, then the slack
matrix of K with respect to these outer and inner descriptions is the
$m \times n$ nonnegative matrix S with entries $S_{ij} := b_i - A_i v_j$.
Yannakakis's theorem states that $\mathrm{xc}(K) = \mathrm{rank}_+(S)$ for
every polytope K and every slack matrix S of K.
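
As a small illustration of the definition $S_{ij} := b_i - A_i v_j$, the following sketch computes a slack matrix of the unit square in the plane. The choice of polytope, the ordering of vertices and inequalities, and the function name `slack_matrix` are assumptions made for this example.

```python
def slack_matrix(A, b, vertices):
    """Return S with S[i][j] = b[i] - <A[i], vertices[j]>."""
    return [[b_i - sum(a * v for a, v in zip(row, vertex))
             for vertex in vertices]
            for row, b_i in zip(A, b)]


# Unit square: -x1 <= 0, -x2 <= 0, x1 <= 1, x2 <= 1.
A = [[-1, 0], [0, -1], [1, 0], [0, 1]]
b = [0, 0, 1, 1]
V = [(0, 0), (1, 0), (1, 1), (0, 1)]

S = slack_matrix(A, b, V)
for row in S:
    print(row)
```

Every entry is nonnegative because each vertex satisfies each inequality, and a zero entry marks a vertex lying on the corresponding facet.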

The Link to Communication Complexity Yannakakis's factorization theorem
initiated an interplay between the extension complexity of polytopes
and (classical) communication complexity.^3 The relevant concept here is a
randomized communication protocol with private randomness and nonnegative
outputs computing a (nonnegative) function $M : X \times Y \to \mathbb{R}_+$
in expectation. For the sake of simplicity, we call this a protocol
computing M in expectation.

Faenza et al. [2011] show that, considering M as a matrix, the complexity
of computing M in expectation equals
$\log(\mathrm{rank}_+(M)) + \Theta(1)$. Thus proving bounds on the
nonnegative rank of M amounts to proving bounds on the required amount of
communication for computing M in expectation.

3 We also assume some familiarity with communication complexity. See
Kushilevitz and Nisan [1997].

It is not hard to see that this last quantity is bounded from below by the
nondeterministic communication complexity of the support of M because
every protocol computing M in expectation can be turned into a
nondeterministic protocol for the support of M. Equivalently, the
nonnegative rank of the matrix M is bounded from below by the minimum number
of 1-monochromatic rectangles covering the support of M. Similarly,
whenever the variance is not too large, a protocol computing M in
expectation can be turned into a randomized protocol computing M with high
probability [Faenza et al., 2011].



(Unique) Disjointness In the disjointness problem (DISJ), both Alice and
Bob receive a subset of [n]. They have to determine whether the two subsets
are disjoint. The disjointness problem is central to communication
complexity; see Chattopadhyay and Pitassi for a survey.



A related problem that captures the hardness of the disjointness problem
is the unique disjointness problem (UDISJ), that is, the promise version
of the disjointness problem where the two subsets are guaranteed to have
at most one element in common. Denoting the binary encoding of the
sets of Alice and Bob by $a, b \in \{0,1\}^n$, respectively, this amounts to
computing the Boolean function $\mathrm{UDISJ}(a, b) := 1 - a^T b$ on the set
of pairs $(a, b) \in \{0,1\}^n \times \{0,1\}^n$ with $a^T b \in \{0, 1\}$.
Viewing it as a partial $2^n \times 2^n$ matrix, we call UDISJ the unique
disjointness matrix.
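
For small n the partial matrix can be written out explicitly. The sketch below is a hedged illustration: representing the matrix as a dictionary and marking blank entries (pairs violating the promise) with `None` are our own conventions.

```python
from itertools import product


def udisj_matrix(n):
    """Partial 2^n x 2^n matrix indexed by pairs of 0/1 vectors.

    The entry is 1 - <a, b> when <a, b> is 0 or 1, and None (blank)
    when the promise |a ∩ b| <= 1 is violated.
    """
    vectors = list(product((0, 1), repeat=n))
    matrix = {}
    for a in vectors:
        for b in vectors:
            inner = sum(x * y for x, y in zip(a, b))
            matrix[a, b] = 1 - inner if inner <= 1 else None
    return matrix


m = udisj_matrix(3)
ones = sum(1 for v in m.values() if v == 1)      # disjoint pairs
zeros = sum(1 for v in m.values() if v == 0)     # pairs sharing one element
blanks = sum(1 for v in m.values() if v is None)
```

In general there are $3^n$ disjoint pairs and $n \cdot 3^{n-1}$ pairs intersecting in exactly one element, which is what the counts for n = 3 reflect.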

It is known that the communication complexity of UDISJ is $\Omega(n)$ bits
for deterministic, nondeterministic and even randomized communication
protocols [Kalyanasundaram and Schnitger, 1992, Razborov, 1992, Bar-Yossef
et al., 2004]. One consequence of this is that the nonnegative rank of any
matrix obtained from UDISJ by filling arbitrarily the blank entries (for
pairs (a, b) with $a^T b > 1$) and perhaps adding rows and/or columns is
still $2^{\Omega(n)}$. Indeed, the support of the resulting matrix has
$\Omega(n)$ nondeterministic communication complexity because it contains
UDISJ.

1.2 Previous Work 



In a recent paper, Fiorini et al. [2012] proved strong lower bounds on the
size of LPs expressing the traveling salesman problem (TSP), or more
precisely on the size of EFs of the TSP polytope. Their proof works by
embedding the UDISJ matrix in a slack matrix of the TSP polytope of the
complete graph on O(n ) vertices. This solved a question left open in
Yannakakis [1991]. We use a similar approach for approximate EFs, which
requires lower bounds on the nonnegative rank of partial matrices obtained
from the UDISJ matrix by adding an offset to all the entries.

Our results are closely related to previous work in communication
complexity for the (unique) disjointness problem and related problems. Lower
bounds of $\Omega(n)$ on the randomized, bounded error communication
complexity of disjointness were established in Kalyanasundaram and
Schnitger [1992]. In Razborov [1992] the distributional complexity of the
unique disjointness problem was analyzed, which in particular implies the
result of Kalyanasundaram and Schnitger [1992]. The main tool here is
Razborov's rectangle corruption lemma showing that in every large
rectangle, the number of 0-entries is proportional to the number of
1-entries. This ensures that monochromatic 1-rectangles have to be small
and therefore a large number is needed to cover all 1-entries, which yields
a lower bound for the nondeterministic communication complexity. It is
precisely this lemma that was used in Fiorini et al. [2012] to establish
lower bounds on the extension complexity of the cut polytope, the stable
set polytope, and the TSP polytope. The most recent proof that the
randomized, bounded error communication complexity of DISJ is $\Omega(n)$
is due to Bar-Yossef et al. [2004] and is based on information theoretic
arguments. This leads to a lower bound for randomized communication within
a high-error regime. Here we derive a strong generalization dealing with
perturbations for approximate EFs and we recover the high-error regime
bound.

There has been extensive work on LP and SDP hierarchies/relaxations and
their limitations; we will be only able to list a few here. In Charikar et
al. [2009], strong lower bounds (of $2 - \varepsilon$) on the integrality gap
for $n^{\varepsilon}$ rounds of the Sherali-Adams hierarchy when applied to
(natural relaxations of) VERTEX COVER, Max CUT, SPARSEST CUT have been
established via embeddings into $\ell_2$; see also Charikar et al. [2010] for
limits and tradeoffs in metric embeddings. For integrality gaps of linear
(and also SDP) relaxations for the KNAPSACK problem see Karlin et al.
[2011]. A nice overview of the differences and commonalities of the
Sherali-Adams, the Lovász-Schrijver and the Lasserre
hierarchies/relaxations can be found in Laurent [2003]. Rank lower bounds
of n for Lovász-Schrijver relaxations of CLIQUE have been obtained in Cook
and Dash [2001]; a similar result for the Sherali-Adams hierarchy can be
found in Laurent [2003]. In Singh and Talwar [2010] integrality gaps, after
adding few rounds of Chvátal-Gomory cuts, have been studied for problems
including k-CSP, Max CUT, VERTEX COVER, and UNIQUE LABEL COVER showing that
in some cases (e.g., k-CSP) the gap can be significantly reduced whereas in
most other cases the gap remains high. SDP relaxations have been studied as
well, in particular formulations derived from the Lovász-Schrijver $N_+$
hierarchies (see Lovász and Schrijver [1991]) and the Lasserre hierarchies
(see Lasserre [2002]). For example, in Arora et al. [2009] an
$O(\sqrt{\log n})$ upper bound on a suitable SDP relaxation of the SPARSEST
CUT problem was obtained. For lower bounds in terms of rank, see e.g.,
Schoenebeck [2008] for the k-CSP in the Lasserre hierarchy or Schoenebeck
et al. [2007] for VERTEX COVER in the semidefinite Lovász-Schrijver
hierarchy. Motivated by the Unique Games Conjecture, several works studied
upper and lower bounds for SDP hierarchy relaxations of Unique Games (see
for example, Guruswami and Sinop [2011], Barak et al. [2011, 2012b,a]). In
Fiorini et al. [2012] a characterization of semidefinite EFs via one-way
quantum communication complexity is established.

Approximate EFs have been studied before, for specific problems, e.g.,
KNAPSACK in Bienstock [2008], or as a general tool, see Vyve and Wolsey
[2006]. The idea of considering a pair of polytopes P, Q first appeared in
Pashkovich [2012] and similar ideas appeared earlier in Gillis and Glineur
[2010]. For recent results on computation of nonnegative rank see Arora et
al. [2012].



1.3 Contribution 

The contribution of the present paper is threefold. 

(i) We develop a new framework for proving lower bounds on the sizes
of approximate EFs. As a generalization of Yannakakis's factorization
theorem, we characterize the minimum size of a $\rho$-approximate EF as
the nonnegative rank of any slack matrix of a pair of nested polyhedra.
Thus we reduce the task of proving approximation limits for LPs
to the task of obtaining lower bounds on the nonnegative ranks of
associated matrices. Typically, these matrices have no zeros, which
renders it impossible to use nondeterministic communication complexity.
We emphasize the fact that the results obtained within our
framework are unconditional. In particular, they do not rely on
P $\neq$ NP.

(ii) We extend Razborov's rectangle corruption lemma to deal with
perturbations of the UDISJ matrix. As a consequence, we prove that the
nonnegative rank of any matrix obtained from the UDISJ matrix by
adding a constant offset to every entry is still $2^{\Omega(n)}$. Moreover,
the nonnegative rank is still $2^{\Omega(n^{2\varepsilon})}$ when the offset is
at most $n^{1/2-\varepsilon}$. To our knowledge, these are the first strong
lower bounds on the nonnegative rank of matrices that contain no zeros.
(Furthermore, the relative difference between any two entries of our
perturbed UDISJ matrices is tiny.) Our extension of Razborov's lemma allows
us to recover known lower bounds for DISJ in the high-error regime of
Bar-Yossef et al. [2004].



(iii) We obtain a strong hardness result for CLIQUE under a natural linear
encoding of CLIQUE. From the results above, we prove that the size
of every $O(n^{1/2-\varepsilon})$-approximate EF for CLIQUE is
$2^{\Omega(n^{\varepsilon})}$. We see this as the first step in obtaining lower
bounds on the sizes of approximate EFs for (faithful linear encodings of)
other problems. Finally, we observe that the same bounds hold for
approximations of SDPs by LPs. This suggests that SDP-based approximation
algorithms can be significantly stronger than LP-based approximation
algorithms. The inapproximability of SDPs by LPs has some interesting
consequences. In particular, we cannot expect to convert SDP-based
approximation algorithms into LP-based ones by approximating the PSD-cone
via linear programming. Moreover, it might indicate that it is not possible
to achieve the same approximation guarantee with LPs for Max CUT
as with the SDP-based algorithm by Goemans and Williamson [1995],
which is known to be optimal assuming the Unique Games Conjecture
[Khot, 2002, Khot et al., 2004, Mossel et al., 2005].



Finally, we point out that our framework readily generalizes to SDPs
by replacing nonnegative rank with PSD rank (see Gouveia et al. [2011] or
Fiorini et al. [2012] for a definition of the PSD rank).



1.4 Outline 

We begin in Section 2 by setting up our framework for studying approximate
extended formulations of combinatorial optimization problems. Then
we extend Razborov's rectangle corruption lemma in Section 3 and use this
to prove strong lower bounds on the nonnegative rank of perturbations of
the UDISJ matrix. Finally, we draw consequences for CLIQUE and
approximations of SDPs by LPs in Section 4.
2 A Framework for Approximation Limits of LPs



In this section we establish the basics of our framework for studying
approximation limits of LPs. First, we define in detail the concepts of
linear encodings and approximate extended formulations. Second, we prove
a factorization theorem for pairs of nested polyhedra reducing existential 
questions on approximate EFs to the computation of nonnegative ranks of 
slack matrices. 

2.1 Linear Encodings of Problems and Approximate EFs 

A linear encoding of a (combinatorial optimization) problem is a pair
$(\mathcal{L}, \mathcal{O})$ where $\mathcal{L} \subseteq \{0,1\}^*$ is the set
of feasible solutions to the problem and $\mathcal{O} \subseteq \mathbb{R}^*$
is the set of admissible objective functions. An instance of the linear
encoding is a pair (d, w) where d is a positive integer and
$w \in \mathcal{O} \cap \mathbb{R}^d$. Solving the instance (d, w) means
finding $x \in \mathcal{L} \cap \{0,1\}^d$ such that $w^T x$ is either
maximum or minimum, according to the type of problem at hand.

For every fixed dimension d, a linear encoding $(\mathcal{L}, \mathcal{O})$
naturally defines a pair of nested convex sets $P \subseteq Q$ where P is
the convex hull of the feasible solutions of dimension d,

$P := \mathrm{conv}(\mathcal{L} \cap \{0,1\}^d)$,

$Q := \{x \in \mathbb{R}^d \mid \forall w \in \mathcal{O} \cap \mathbb{R}^d : w^T x \le \max\{w^T x \mid x \in P\}\}$

if the goal is to maximize and

$Q := \{x \in \mathbb{R}^d \mid \forall w \in \mathcal{O} \cap \mathbb{R}^d : w^T x \ge \min\{w^T x \mid x \in P\}\}$

otherwise. Intuitively, the vertices of P encode the feasible solutions of
the problem under consideration and the defining inequalities of Q encode
the admissible linear objective functions. Notice that P is always a
0/1-polytope but Q might be unbounded and, in some pathological cases,
nonpolyhedral. Below, we will mostly consider the case where Q is
polyhedral, that is, defined by a finite number of "interesting"
inequalities.

Given a linear encoding $(\mathcal{L}, \mathcal{O})$ of a maximization
problem, and $\rho \ge 1$, a $\rho$-approximate extended formulation (EF) is
an extended formulation $Ex + Fy = g$, $y \ge 0$ with
$(x, y) \in \mathbb{R}^{d+r}$ such that
$\max\{w^T x \mid Ex + Fy = g,\ y \ge 0\} \ge \max\{w^T x \mid x \in P\}$ for
all $w \in \mathbb{R}^d$ and
$\max\{w^T x \mid Ex + Fy = g,\ y \ge 0\} \le \rho \max\{w^T x \mid x \in P\}$
for all $w \in \mathcal{O} \cap \mathbb{R}^d$. Letting
$K := \{x \in \mathbb{R}^d \mid \exists y \in \mathbb{R}^r : Ex + Fy = g,\ y \ge 0\}$,
we see that this is equivalent to $P \subseteq K \subseteq \rho Q$.
For a minimization problem, we require
$\min\{w^T x \mid Ex + Fy = g,\ y \ge 0\} \le \min\{w^T x \mid x \in P\}$ for
all $w \in \mathbb{R}^d$ and
$\min\{w^T x \mid Ex + Fy = g,\ y \ge 0\} \ge \rho^{-1} \min\{w^T x \mid x \in P\}$
for all $w \in \mathcal{O} \cap \mathbb{R}^d$. This is equivalent to
$P \subseteq K \subseteq \rho^{-1} Q$.

We require the following faithfulness condition: every instance of the 
problem can be mapped to an instance of the linear encoding in such a way 
that feasible solutions to an instance of the problem can be converted in 
polynomial time to feasible solutions to the corresponding instance of the 
linear encoding without deteriorating their objective function values, and 
vice-versa. Roughly speaking, we ask that each instance of the problem can 
be encoded as an instance of the linear encoding. 

For example, consider the maximum k-SAT problem (Max k-SAT) with
k constant. Letting $u_1, \ldots, u_n$ denote the variables of a Max k-SAT
instance, we encode the problem in dimension $d = \Theta(n^k)$. For each
nonempty clause C of size at most k, we introduce a variable $x_C$.
Collectively, these variables define a point $x \in \mathbb{R}^d$. Given a
truth assignment, we set $x_C$ to 1 if C is satisfied and otherwise we set
$x_C$ to 0. Letting n vary, this defines a language
$\mathcal{L} \subseteq \{0,1\}^*$. We let $\mathcal{O} := \{0,1\}^*$.

The pair $(\mathcal{L}, \mathcal{O})$ defines a linear encoding of Max k-SAT
because each instance of Max k-SAT can be encoded as an instance of
$(\mathcal{L}, \mathcal{O})$. More precisely, to any given set of clauses over
n variables, we can associate a dimension $d = \Theta(n^k)$ and weight vector
$w \in \{0,1\}^d$ such that maximizing $\sum_C w_C x_C$ for
$x \in \mathcal{L} \cap \{0,1\}^d$ corresponds to finding a truth assignment
that maximizes the number of satisfied clauses.
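
The encoding just described can be sketched in a few lines of Python for small n and k. This is an illustration under our own assumptions: literals are represented as signed integers (+i for $u_i$, -i for its negation), clauses use distinct variables, and the function names are hypothetical.

```python
from itertools import combinations, product


def all_clauses(n, k):
    """All nonempty clauses with at most k literals over distinct variables."""
    clauses = []
    for size in range(1, k + 1):
        for variables in combinations(range(1, n + 1), size):
            for signs in product((1, -1), repeat=size):
                clauses.append(frozenset(s * v for s, v in zip(signs, variables)))
    return clauses


def encode_assignment(assignment, clauses):
    """Feasible solution x: x_C = 1 if clause C is satisfied, else 0."""
    def satisfied(clause):
        return any(assignment[abs(lit)] == (lit > 0) for lit in clause)
    return [1 if satisfied(c) else 0 for c in clauses]


def encode_instance(instance, clauses):
    """Objective w in {0,1}^d: w_C = 1 if clause C belongs to the instance."""
    return [1 if c in instance else 0 for c in clauses]


clauses = all_clauses(3, 2)
# Instance: (u1 or u2), (not u1), (not u2 or u3).
instance = {frozenset({1, 2}), frozenset({-1}), frozenset({-2, 3})}
w = encode_instance(instance, clauses)
x = encode_assignment({1: False, 2: True, 3: True}, clauses)
value = sum(wc * xc for wc, xc in zip(w, x))
```

Here `value` counts the clauses of the instance satisfied by the assignment, so maximizing $w^T x$ over all encoded assignments solves the instance.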

Finally, we remark that the EF defined by the inequalities
$0 \le x_C \le 1$ and
$x_C \le \sum_{u_i \in C} x_{\{u_i\}} + \sum_{\bar{u}_i \in C} (1 - x_{\{u_i\}})$
for all clauses C is a polynomial-size 4/3-approximate EF for Max k-SAT, as
follows from Goemans and Williamson [1994].



2.2 A Factorization Theorem for Pairs of Nested Polyhedra 

Let P be a polytope and Q be a polyhedron with
$P \subseteq Q \subseteq \mathbb{R}^d$. An extended formulation (EF) of the
pair P, Q is a system $Ex + Fy = g$, $y \ge 0$ defining a polyhedron
$K := \{x \in \mathbb{R}^d \mid \exists y : Ex + Fy = g,\ y \ge 0\}$ such that
$P \subseteq K \subseteq Q$. We denote by xc(P, Q) the minimum size of an EF
of the pair P, Q.

Now, consider an inner description of P and an outer description of Q,
say $P := \mathrm{conv}(V)$ and $Q := \{x \in \mathbb{R}^d \mid Ax \le b\}$
where $V := \{v_1, \ldots, v_n\}$ and $Ax \le b$ has m inequalities denoted by
$A_1 x \le b_1, \ldots, A_m x \le b_m$. The slack matrix of the pair P, Q
w.r.t. these inner and outer descriptions is the $m \times n$ matrix
$S^{P,Q}$ with $S^{P,Q}_{ij} := b_i - A_i v_j$ for $i \in [m]$ and $j \in [n]$.
Our first result gives an exact characterization of xc(P, Q) in terms of the
nonnegative rank of the slack matrix of the pair P, Q. It states that the
minimum extension complexity of a polyhedron sandwiched between P and
Q is exactly xc(P, Q). The result readily generalizes Yannakakis's
factorization theorem [Yannakakis, 1991], which concerns the case P = Q. It
first appeared in Pashkovich [2012]. The proof can be found in Appendix B.1.



Theorem 1. With the above notations, we have
$\mathrm{xc}(P, Q) = \mathrm{rank}_+(S^{P,Q})$ for every slack matrix of the
pair P, Q. The minimal value is realized by an EF where K is a polytope.

Let P, Q be as above and $\rho \ge 1$. Then
$\rho Q = \{x \in \mathbb{R}^d \mid Ax \le \rho b\}$ and the slack matrix of
the pair $P, \rho Q$ is related to the slack matrix of the pair P, Q in the
following way:

$S^{P,\rho Q}_{ij} = \rho b_i - A_i v_j = (\rho - 1) b_i + b_i - A_i v_j = S^{P,Q}_{ij} + (\rho - 1) b_i.$
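
The row-wise offset in this identity can be checked numerically on a toy pair. The sketch below is only an illustration: the triangle P, the polyhedron Q, the value $\rho = 3$ and the function name are hypothetical choices made for this example.

```python
def slack_matrix(A, b, vertices):
    """Return S with S[i][j] = b[i] - <A[i], vertices[j]>."""
    return [[b_i - sum(a * v for a, v in zip(row, vertex))
             for vertex in vertices]
            for row, b_i in zip(A, b)]


# Inner polytope P = conv{(0,0), (1,0), (0,1)}.
V = [(0, 0), (1, 0), (0, 1)]
# Outer polyhedron Q = {x : x1 <= 1, x2 <= 1, x1 + x2 <= 2}, which contains P.
A = [[1, 0], [0, 1], [1, 1]]
b = [1, 1, 2]
rho = 3

S = slack_matrix(A, b, V)                              # pair P, Q
S_rho = slack_matrix(A, [rho * b_i for b_i in b], V)   # pair P, rho Q

# Row i of S_rho is row i of S shifted by the constant (rho - 1) * b_i.
shifted = [[entry + (rho - 1) * b_i for entry in row]
           for row, b_i in zip(S, b)]
```

Scaling Q by $\rho$ therefore turns zeros of the slack matrix into positive entries in every row with $b_i > 0$, which is why the matrices arising for approximate EFs typically have no zeros.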
Theorem 1 directly yields the following result.

Theorem 2. Consider a maximization problem and linear encoding for this
problem. Let $P, Q \subseteq \mathbb{R}^d$ be polyhedral and associated with
the linear encoding, and let $\rho \ge 1$. Consider any slack matrix
$S^{P,Q}$ for the pair P, Q and the corresponding slack matrix
$S^{P,\rho Q}$ for the pair $P, \rho Q$. Then the minimum size of a
$\rho$-approximate EF of the problem, w.r.t. the considered linear encoding,
is exactly $\mathrm{rank}_+(S^{P,\rho Q})$. For a minimization problem, the
minimum size of a $\rho$-approximate EF is
$\mathrm{rank}_+(S^{P,\rho^{-1} Q})$.

Fixing $\rho \ge 1$, Theorem 2 characterizes the minimum number of
inequalities in any LP providing a $\rho$-approximation for the problem under
consideration. We point out that the theorem directly generalizes to SDPs,
by replacing nonnegative rank by PSD rank [Gouveia et al., 2011, Fiorini et
al., 2012]. Here, we focus on LPs and nonnegative rank. As a matter of fact,
strong lower bounds on the PSD rank seem to be currently lacking.



2.3 A Problem with no Polynomial-Size Approximate EF 

We conclude this section with an example showing the necessity to restrict
the set of admissible objective functions rather than allowing every
$w \in \mathbb{R}^*$ (that is, P = Q).

Let $K_n = (V_n, E_n)$ denote the n-vertex complete graph. For a set X of
vertices of $K_n$, we let $\delta(X)$ denote the set of edges of $K_n$ with
one endpoint in X and the other in its complement $\overline{X}$. This set
$\delta(X)$ is known as the cut defined by X. For a subset F of edges of
$K_n$, we let $\chi^F \in \mathbb{R}^{E_n}$ denote the characteristic vector of
F, with $\chi^F_e = 1$ if $e \in F$ and $\chi^F_e = 0$ otherwise. The cut
polytope CUT(n) is defined as the convex hull of the characteristic
vectors of all cuts in the complete graph $K_n = (V_n, E_n)$. That is,

$\mathrm{CUT}(n) := \mathrm{conv}(\{\chi^{\delta(X)} \in \mathbb{R}^{E_n} \mid X \subseteq V_n\}).$

Consider the maximum cut problem (Max CUT) with arbitrary weights,
and its usual linear encoding. With this encoding we have P = Q =
CUT(n). Our next result states that this problem has no polynomial-size
$\rho$-approximate EF, whatever $\rho \ge 1$ is. Intuitively, this phenomenon
stems from the fact that, because 0 is a vertex of the cut polytope, every
approximate EF necessarily "captures" all facets of the cut polytope
incident to 0 (see the figure in Appendix B.2). These facets define the cut
cone, which has high "extension complexity". A proof sketch is given in
Appendix B.

Proposition 3. For every $\rho \ge 1$, every $\rho$-approximate EF of the Max
CUT problem with arbitrary weights has $2^{\Omega(n)}$ size. More precisely,
regardless of the value of $\rho \ge 1$, we have
$\mathrm{xc}(\mathrm{CUT}(n), \rho\,\mathrm{CUT}(n)) = 2^{\Omega(n)}$.



3 Extending Razborov's Lemma and 
Perturbing Unique Disjointness 

In the first subsection we generalize Razborov's famous lemma on the
disjointness problem (see Razborov [1992] or Kushilevitz and Nisan [1997,
Lemma 4.49] for the original version). In the next subsection we apply it
to perturb the UDISJ matrix without significantly decreasing its
nonnegative rank, which will be used in later sections to obtain lower
bounds on approximate extended formulations.

The main improvements to Razborov's lemma are threefold, all serving
ease of application to extended formulations: (i) constants are
parametrized in order to use their optimal value; (ii) probabilities are
generalized to expected values to allow rank-1 matrices instead of product
sets and to incorporate nonnegative rank; (iii) better analytical
estimations are employed to improve the overall strength of the statement.



3.1 Extending Razborov's Rectangle Corruption Lemma 

For every $0 < p < 1$ and $1 \le \ell \le n/2$ we define the following
distribution $\mu$ of random subsets a and b of size $\ell$ of [n]. We flip a
biased coin and with probability p, we choose (a, b) uniformly among the
pairs of subsets intersecting in exactly one element; with probability
$1 - p$, we choose (a, b) uniformly among the pairs of disjoint subsets.
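
A sampler for this distribution can be sketched as follows. It is a minimal illustration, assuming subsets are represented as Python sets over $\{1, \ldots, n\}$; the function name `sample_pair` and its signature are our own.

```python
import random


def sample_pair(n, ell, p, rng=random):
    """Draw (a, b), two subsets of [n] of size ell each.

    With probability p the pair is uniform among pairs intersecting in
    exactly one element; otherwise it is uniform among disjoint pairs.
    """
    if not (0 < p < 1 and 1 <= ell <= n / 2):
        raise ValueError("need 0 < p < 1 and 1 <= ell <= n/2")
    ground = list(range(1, n + 1))
    if rng.random() < p:
        # One common element plus two disjoint blocks of ell - 1 elements.
        chosen = rng.sample(ground, 2 * ell - 1)
        common = chosen[0]
        a = {common, *chosen[1:ell]}
        b = {common, *chosen[ell:]}
    else:
        # Two disjoint blocks of ell elements.
        chosen = rng.sample(ground, 2 * ell)
        a = set(chosen[:ell])
        b = set(chosen[ell:])
    return a, b


rng = random.Random(0)
pairs = [sample_pair(12, 3, 0.25, rng) for _ in range(200)]
```

Uniformity within each branch follows because `sample` returns a uniformly random ordered selection, and every admissible pair corresponds to the same number of orderings.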

Lemma 4. For every $0 < p < 1$, $n \ge 3$, $1 \le \ell \le (n+1)/4$ let $\mu$
be the probability distribution above. Furthermore, let
$A = \{(a, b) \mid a \cap b = \emptyset\}$ denote the event that a and b are
disjoint, and $B = \{(a, b) \mid |a \cap b| = 1\}$ denote the event that a
and b intersect in exactly one element. For every sequence of nonnegative
functions $f_1, g_1, \ldots, f_r, g_r$ defined on the subsets of [n], we
introduce a random variable $X := \sum_{i=1}^r f_i(a) g_i(b)$. Then for every
$0 < \varepsilon < 1$

$\mathbb{E}[X I_B] \ge (1 - \varepsilon)\,\frac{p}{1-p}\,\mathbb{E}[X I_A] - r\,p\,\|X I_A\|_\infty\, 2^{-\Omega(\varepsilon^2 \ell) + O(\log \ell)}, \qquad (1)$

where the constant hidden in $\Omega(\cdot)$ does not depend on $\varepsilon$,
$O(\log \ell)$ is a function only in $\ell$, and $I_A$ and $I_B$ are the
indicators of the events A and B respectively.

A strengthened version of the original lemma is recovered by the choice
$p = 1/4$, $r = 1$, $\ell = (n+1)/4$ and X the characteristic function of the
rectangle $R = C \times D$.



Our proof, given in Appendix C.1, is inspired by the version in Kushilevitz
and Nisan [1997, Lemma 4.49] and we adopt similar notation.



3.2 Randomized Communication Complexity of Disjointness 

Using our quantitative version of Razborov's rectangle bound (Lemma 4),
we show a lower bound on the randomized communication complexity of
UDISJ in the high-error regime (where the protocol is allowed to err with
probability $1/2 - \varepsilon$). This lower bound matches the best known
one^4 (see Bar-Yossef et al. [2004]). It follows that the parameters of
Lemma 4 are tight unless a better lower bound for the communication
complexity of UDISJ holds in this error regime. See Appendix C.2 for the
proof.

Corollary 5 (Lower bound for randomized communication with high error).
For $\varepsilon > 0$ we have

$R_{1/2-\varepsilon}(\mathrm{UDISJ}_n) \ge \varepsilon^2\, \Omega(n),$

where the constant hidden in $\Omega(n)$ does not depend on $\varepsilon$.

Clearly the result also holds for the weaker DISJ problem. 



4 For an explicit statement of this bound, see the lecture notes of the
2011/2012 communication complexity course at TIFR/IMSc, available at
http://www.tcs.tifr.res.in/~prahladri/teaching/2011-12/comm/lectures/112.pdf
3.3 Lower Bounds for Perturbations of Unique Disjointness

Now we apply Lemma 4 to show that the nonnegative rank (and hence the
complexity of computation in expectation) of any perturbed version of the
unique disjointness matrix remains high. More precisely, let
$M \in \mathbb{R}_+^{2^n \times 2^n}$; for convenience we index the rows and
columns with elements in $\{0,1\}^n$. We say that M is a $\rho$-extension of
UDISJ if $M_{ab} = \rho - 1$ whenever $|a \cap b| = 1$ and $M_{ab} = \rho$
whenever $a \cap b = \emptyset$, with $a, b \in \{0,1\}^n$. Note that for
these pairs M has exclusively positive entries whenever $\rho > 1$. For
$\rho = 1$ a nonnegative rank of $2^{\Omega(n)}$ was already shown in Fiorini
et al. [2012] via nondeterministic communication complexity. We now extend
this result to a wide range of $\rho$ using Lemma 4.

Theorem 6 (Nonnegative rank of UDISJ perturbations). Let
$M \in \mathbb{R}_+^{2^n \times 2^n}$ be a $\rho$-extension of UDISJ as above.
If

(i) $\rho$ is a fixed constant, then $\mathrm{rank}_+(M) = 2^{\Omega(n)}$.
(ii) $\rho = O(n^{\beta})$ for some constant $\beta < 1/2$, then $\mathrm{rank}_+(M) = 2^{\Omega(n^{1-2\beta})}$.

Proof. Regarding the $2^n \times 2^n$ matrix $M$ as a random variable over $2^{[n]} \times 2^{[n]}$,
we apply Lemma 4 to $X := M$. Suppose that $M$ has a rank-$r$ nonnegative
factorization. Therefore we can write $X$ as $X(a,b) = \sum_{i=1}^{r} f_i(a) g_i(b)$, where
$f_i$ and $g_i$ are nonnegative functions defined over $[2^n]$ with $i \in [r]$. Note that
$M I_A = \rho I_A$ and $M I_B = (\rho - 1) I_B$, and so Equation (1) reduces to

$$p(\rho - 1) \ge (1-\varepsilon)\,\frac{p}{1-p}\,(1-p)\,\rho - r\,p\,\rho\,2^{-\frac{\varepsilon^2}{16\ln 2}n + O(\log n)},$$

which gives the lower bound

$$r \ge \left(\frac{1}{\rho} - \varepsilon\right) 2^{\frac{\varepsilon^2}{16\ln 2}n - O(\log n)}.$$

If $\rho$ is constant, this last expression is $2^{\Omega(n)}$ provided $\varepsilon$ is chosen sufficiently
close to 0. This proves part (i) of the theorem.

If $\rho \le C n^{\beta}$ for some positive constant $C$, then we can take $\varepsilon = \frac{1}{2\rho}$.
Thus $\frac{1}{\rho} - \varepsilon = \frac{1}{2\rho} = \Omega(n^{-\beta})$. This leads to the lower bound $r \ge 2^{\Omega(n^{1-2\beta})}$,
as claimed in part (ii). □

4 Polyhedral Inapproximability of CLIQUE and SDPs



We will now use Theorem 6 in combination with Theorem 2 to lower bound
the sizes of certain approximate EFs. First, we pinpoint a pair P, Q of nested 
polyhedra that will be the source of our polyhedral inapproximability re- 
sults. Second, we give a faithful linear encoding of CLIQUE and prove 
strong lower bounds on the sizes of approximate EFs for CLIQUE w.r.t. 
this encoding. Third, we focus on approximations of SDPs by LPs. 



4.1 A Hard Pair 

Let $n$ be a positive integer. The correlation polytope $\mathrm{COR}(n)$ is defined as the
convex hull of all the $n \times n$ rank-1 binary matrices of the form $bb^T$ where
$b \in \{0,1\}^n$. In other words, $\mathrm{COR}(n) = \operatorname{conv}(\{bb^T \mid b \in \{0,1\}^n\})$. This will
be our inner polytope $P$. Next, let
$Q = Q(n) := \{x \in \mathbb{R}^{n \times n} \mid \langle 2\operatorname{diag}(a) - aa^T, x\rangle \le 1,\ a \in \{0,1\}^n\}$,
where $\langle\cdot,\cdot\rangle$ denotes the Frobenius inner product.
This will be our outer polyhedron $Q$.



Then the following is known, see Fiorini et al. [2012]. First, $P \subseteq Q$.
Second, denoting by $S^{P,Q}$ the slack matrix of the pair $P$, $Q$, we have
$S^{P,Q}_{ab} = (1 - a^T b)^2$. Thus, for $\rho \ge 1$, we have $S^{P,\rho Q}_{ab} = (1 - a^T b)^2 + \rho - 1$. Observe
that the matrix $S^{P,\rho Q}$ is a $\rho$-extension of UDISJ and therefore has high
nonnegative rank via Theorem 6; moreover, it has positive entries everywhere
for $\rho > 1$. Together with Theorem 1 this implies that every polytope
sandwiched between $P = \mathrm{COR}(n)$ and $\rho Q$ has large extension complexity. We
obtain the following theorem.

Theorem 7 (Lower bounds for approximate EFs of the hard pair). Let $\rho \ge 1$,
let $n$ be a positive integer and let $P = \mathrm{COR}(n)$, $Q = Q(n)$ be as above. Then the
following hold:

(i) If $\rho$ is a fixed constant, then $\operatorname{xc}(P, \rho Q) = 2^{\Omega(n)}$.
(ii) If $\rho = O(n^{\beta})$ for some constant $\beta < 1/2$, then $\operatorname{xc}(P, \rho Q) = 2^{\Omega(n^{1-2\beta})}$.
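
The slack identity behind the hard pair can be checked by brute force for small $n$. The sketch below evaluates the slack of each vertex $bb^T$ of $\mathrm{COR}(n)$ in each inequality of $\rho Q$ directly from the definitions and compares it with $(1 - a^T b)^2 + \rho - 1$. All function names are illustrative assumptions, and matrices are plain nested lists.

```python
from itertools import product

def frobenius(X, Y):
    """Frobenius inner product of two square matrices given as nested lists."""
    n = len(X)
    return sum(X[i][j] * Y[i][j] for i in range(n) for j in range(n))

def constraint_matrix(a):
    """2 diag(a) - a a^T, the normal of the inequality of Q indexed by a."""
    n = len(a)
    return [[(2 * a[i] if i == j else 0) - a[i] * a[j] for j in range(n)]
            for i in range(n)]

def vertex(b):
    """b b^T, a vertex of COR(n)."""
    return [[bi * bj for bj in b] for bi in b]

def slack(a, b, rho=1):
    """Slack of b b^T in the inequality <2 diag(a) - a a^T, x> <= rho of rho*Q."""
    return rho - frobenius(constraint_matrix(a), vertex(b))

def check_slack_identity(n, rho=1):
    """Verify slack(a, b) == (1 - a^T b)^2 + rho - 1 for all a, b in {0,1}^n."""
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            t = sum(x * y for x, y in zip(a, b))
            if slack(a, b, rho) != (1 - t) ** 2 + rho - 1:
                return False
    return True
```

The check relies on $b$ being binary, so that $\langle \operatorname{diag}(a), bb^T\rangle = a^T b$ and $\langle aa^T, bb^T\rangle = (a^T b)^2$.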



4.2 Polyhedral Inapproximability of CLIQUE 

We define a convenient linear encoding for the maximum clique problem
(CLIQUE) as follows. Let $n$ denote the number of vertices of the input
graph. We define a $d = n^2$ dimensional encoding. The variables are denoted
by $x_{ij}$ for $i, j \in [n]$. Thus $x \in \mathbb{R}^{n \times n}$. The interpretation is that a set of
vertices $X$ is encoded by $x_{ij} = 1$ if $i, j \in X$ and $x_{ij} = 0$ otherwise. Note that
$X = \{i : x_{ii} = 1\}$ can be recovered from only the diagonal variables. This
defines the set $\mathcal{L} \subseteq \{0,1\}^*$ of feasible solutions. Notice that $x \in \{0,1\}^{n \times n}$
is feasible if and only if it is of the form $x = bb^T$ for some $b \in \{0,1\}^n$,
the characteristic vector of $X$. Thus we have $P = \mathrm{COR}(n)$ for the inner
polytope.

An objective function $w \in \mathbb{R}^{n \times n}$ is admissible if $w_{ii} \in \{0, 1\}$ for the
diagonal coefficients and $w_{ij} = w_{ji} \in \{-1, 0\}$ for the off-diagonal coefficients.
This defines the set $\mathcal{O} \subseteq \{-1,0,1\}^*$ of admissible objective functions.

Given a graph $G$ such that $V(G) \subseteq [n]$, we let $w_{ii} := 1$ for $i \in V(G)$,
$w_{ii} := 0$ for $i \in [n] \setminus V(G)$, $w_{ij} = w_{ji} := -1$ when $ij$ is a non-edge of $G$, and
$w_{ij} = w_{ji} := 0$ otherwise. We denote the resulting weight vector by $w^G$.
Notice that for a graph $G$ with $V(G) = [n]$, we have $w^G = I - A(\bar{G})$ where $I$
is the $n \times n$ identity matrix and $A(\bar{G})$ is the adjacency matrix of the complement
of $G$. A feasible solution $x = bb^T \in \{0,1\}^{n \times n}$ maximizes $\langle w^G, x\rangle$ only if $b$
is the incidence vector of a clique of $G$. Indeed, if $b = \chi^X$ and $i, j \in X$ are
non-adjacent in $G$, then removing $i$ or $j$ from $X$ increases $\langle w^G, x\rangle$. Moreover,
the maximum of $\langle w^G, x\rangle$ over feasible $x \in \{0,1\}^{n \times n}$ is the clique number
$\omega(G)$. Therefore, $(\mathcal{L}, \mathcal{O})$ defines a valid linear encoding of CLIQUE. We denote
the outer convex set of this linear encoding by $Q^{\mathrm{all}}$. It is actually the polyhedron defined as
$Q^{\mathrm{all}} = \{x \in \mathbb{R}^{n \times n} \mid \forall \text{ graphs } G \text{ s.t. } V(G) \subseteq [n] : \langle w^G, x\rangle \le \omega(G),\ \forall i \ne j \in [n] : x_{ij} \ge 0\}$.
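
The claim that the encoding recovers the clique number can be checked by brute force on small graphs. The sketch below assumes $V(G) = [n]$ with vertices renamed $0, \dots, n-1$; the function names and the edge-list input format are illustrative assumptions.

```python
from itertools import product

def weight_matrix(n, edges):
    """w^G for a graph on vertices 0..n-1: 1 on the diagonal, -1 on non-edges, 0 on edges."""
    E = {frozenset(e) for e in edges}
    return [[1 if i == j else (0 if frozenset((i, j)) in E else -1)
             for j in range(n)] for i in range(n)]

def objective(w, b):
    """<w, b b^T> for a 0/1 vector b."""
    n = len(b)
    return sum(w[i][j] * b[i] * b[j] for i in range(n) for j in range(n))

def max_objective(n, edges):
    """Maximum of <w^G, b b^T> over all b in {0,1}^n (exponential enumeration)."""
    w = weight_matrix(n, edges)
    return max(objective(w, b) for b in product((0, 1), repeat=n))

def clique_number(n, edges):
    """Clique number by enumerating all vertex subsets."""
    E = {frozenset(e) for e in edges}
    best = 0
    for b in product((0, 1), repeat=n):
        X = [i for i in range(n) if b[i]]
        if all(frozenset((i, j)) in E for i in X for j in X if i < j):
            best = max(best, len(X))
    return best
```

For a vertex set $X$, the objective equals $|X|$ minus twice the number of non-edges inside $X$, so it equals $|X|$ exactly when $X$ is a clique.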

Because $Q^{\mathrm{all}}$ is contained in the polyhedron $Q$ defined above, every $K$
satisfying $P \subseteq K \subseteq \rho Q^{\mathrm{all}}$ also satisfies $P \subseteq K \subseteq \rho Q$. Hence, Theorem 7
yields the following result. See Appendix D for the proof.

Theorem 8 (Polyhedral inapproximability of CLIQUE). W.r.t. the linear encoding
defined above, CLIQUE has an $O(n^2)$-size $n$-approximate EF. Moreover,
every $n^{1/2-\varepsilon}$-approximate EF of CLIQUE has size $2^{\Omega(n^{2\varepsilon})}$, for all $0 < \varepsilon < 1/2$.



4.3 Polyhedral Inapproximability of SDPs 



In this section we establish the existence of a spectrahedron with small
semidefinite extension complexity but high approximate extension complexity;
i.e., any sufficiently fine polyhedral approximation is large. This
indicates that in general it is not possible to approximate SDPs arbitrarily
well using LPs, so that SDPs are indeed a much stronger class of optimization
problems. (The situation looks quite different for SOCPs, see
Ben-Tal and Nemirovski [2001].) The result follows from Theorem 7 and
Fiorini et al. [2012].

We denote the cone of all $r \times r$ symmetric positive semidefinite matrices
(shortly, the PSD cone) by $\mathbb{S}_+^r$. A semidefinite EF of a convex set $S \subseteq \mathbb{R}^d$ is a
system $Ex + Fy = g$, $y \in \mathbb{S}_+^r$ such that $x \in S$ if and only if $\exists y \in \mathbb{R}^{r(r+1)/2}$
with $Ex + Fy = g$, $y \in \mathbb{S}_+^r$. Thus a convex set admits a semidefinite EF if and
only if it is a spectrahedron. The size of the semidefinite EF $Ex + Fy = g$,
$y \in \mathbb{S}_+^r$ is simply $r$. The semidefinite extension complexity of a spectrahedron
$S \subseteq \mathbb{R}^d$ is the minimum size of a semidefinite EF of $S$. This is denoted by
$\operatorname{xc}_{\mathrm{SDP}}(S)$.

Our final result is the following inapproximability theorem for spectra- 
hedra. 

Theorem 9 (Polyhedral inapproximability of SDPs). Let $\rho \ge 1$, and let $n$ be
a positive integer. Then there exists a spectrahedron $S \subseteq \mathbb{R}^{n \times n}$ with $\operatorname{xc}_{\mathrm{SDP}}(S) \le n + 1$
such that for every polytope $K$ with $S \subseteq K \subseteq \rho S$ the following hold:

(i) If $\rho$ is a fixed constant, then $\operatorname{xc}(K) = 2^{\Omega(n)}$.

(ii) If $\rho = O(n^{\beta})$ for some constant $\beta < 1/2$, then $\operatorname{xc}(K) = 2^{\Omega(n^{1-2\beta})}$.

5 Concluding Remarks 

We have introduced a general framework to study approximation limits of
small LP relaxations. Given a polyhedron $Q$ encoding admissible objective
functions and a polytope $P$ encoding feasible solutions, we have proved
that any LP relaxation sandwiched between $P$ and a dilate $\rho Q$ has extension
complexity at least the nonnegative rank of the slack matrix of the pair $P$,
$\rho Q$.

This yields a lower bound depending only on the linear encoding of the 
problem at hand, and applies independently of the structure of the actual re- 
laxation. By doing so, we obtain unconditional lower bounds on integrality 
gaps for small LP relaxations, which hold even in the unlikely event that 
P = NP. 

We have proved that every polynomial-size LP relaxation for (a natural
linear encoding of) CLIQUE has essentially an $\Omega(\sqrt{n})$ integrality gap.

Finally, our work sheds more light on the inherent limitations of LPs in 
the context of combinatorial optimization and approximation algorithms, 
in particular, in comparison to SDPs. We provide strong evidence that 
certain approximation guarantees can only be achieved via non-LP-based 
techniques (e.g., SDP-based or combinatorial). 

We are convinced that our framework can be used to obtain strong
approximation limits for (LP relaxations of) other well-known problems
such as Max CUT, Max k-SAT and VERTEX COVER. The following important
questions remain open.



(i) Is it possible to show a constant-factor polyhedral inapproximability 
for Max CUT with nonnegative weights (and similarly for VERTEX 
COVER and many more) for any polynomial-size LP? We conjecture 
that it is not possible to approximate Max CUT with LPs of poly-size 
within a factor better than 2. 

(ii) So far no strong lower bounding technique for semidefinite EFs is
known. Recent work by Lee and Theis [2012] provides hope to obtain
such lower bounds. In fact the authors introduce a combinatorial
lower bounding technique that they apply to relaxations of Max CUT
and TSP. Although the details of their approach are not yet available,
it is plausible that in the near future we will see lower bounding
techniques on the PSD rank that would be suited for studying
approximation limits of SDPs. (We remark however that such bounds should
not only argue on the zero/nonzero pattern of a slack matrix.)



References 

S. Arora, B. Bollobas, and L. Lovasz. Proving integrality gaps without 
knowing the linear program. In Proc. FOCS, pages 313-322, 2002. 

S. Arora, S. Rao, and U. Vazirani. Expander flows, geometric embeddings
and graph partitioning. J. ACM, 56(2):5, 2009.

S. Arora, R. Ge, R. Kannan, and A. Moitra. Computing a nonnegative matrix
factorization - provably. Accepted for STOC 2012, 2012.

Z. Bar-Yossef, T. Jayram, R. Kumar, and D. Sivakumar. An information
statistics approach to data stream and communication complexity. J.
Comput. System Sci., 68(4):702-732, 2004.

B. Barak, P. Raghavendra, and D. Steurer. Rounding semidefinite program- 
ming hierarchies via global correlation. In Proc. FOCS, pages 472-481. 
IEEE, 2011. 

B. Barak, F. G. Brandao, A. Harrow, J. Kelner, D. Steurer, and Y. Zhou.
Hypercontractivity, sum-of-squares proofs, and their applications. In STOC,
2012a. To appear.



B. Barak, P. Gopalan, J. Hastad, R. Meka, P. Raghavendra, and D. Steurer. 
Making the long code shorter, 2012b. Manuscript. 

A. Ben-Tal and A. Nemirovski. On polyhedral approximations of the 
second-order cone. Math. Oper. Res., 26:193-205, 2001. 

D. Bienstock. Approximate formulations for 0-1 knapsack sets. Oper. Res. 
Lett., 36(3):317-320, 2008. 

M. Charikar, K. Makarychev, and Y. Makarychev. Integrality gaps for
Sherali-Adams relaxations. In Proc. STOC, pages 283-292. ACM, 2009.

M. Charikar, K. Makarychev, and Y. Makarychev. Local global tradeoffs in 
metric embeddings. SIAM J. Comput., 39(6):2487-2512, 2010. 

A. Chattopadhyay and T. Pitassi. The story of set disjointness. SIGACT 
News, 41:59-85, 2010. 

W. Cook and S. Dash. On the matrix-cut rank of polyhedra. Math. Oper. 
Res., 26:19-30, 2001. 

Y. Faenza, S. Fiorini, R. Grappe, and H. R. Tiwary. Extended formulations,
non-negative factorizations and randomized communication protocols.
arXiv:1105.4127, 2011.

S. Fiorini, S. Massar, S. Pokutta, and R. de Wolf. Linear vs. semidefinite
extended formulations: Exponential separation and strong lower bounds.
Accepted for STOC 2012, 2012.

N. Gillis and F. Glineur. On the geometric interpretation of the nonnegative
rank. arXiv:1009.0880, 2010.

M. X. Goemans and D. P. Williamson. A new 3/4-approximation algorithm
for max sat. SIAM J. Discrete Math., 7:313-321, 1994.

M. X. Goemans and D. P. Williamson. Improved approximation algorithms
for maximum cut and satisfiability problems using semidefinite
programming. J. Assoc. Comput. Mach., 42:1115-1145, 1995.

J. Gouveia, P. A. Parrilo, and R. Thomas. Lifts of convex sets and cone
factorizations. arXiv:1111.3164, 2011.

V. Guruswami and A. K. Sinop. Lasserre hierarchy, higher eigenvalues, and
approximation schemes for quadratic integer programming with PSD
objectives. In FOCS, 2011.



M. Held and R. Karp. The traveling salesman problem and minimum span- 
ning trees. Oper. Res., 18:1138-1162, 1970. 

B. Kalyanasundaram and G. Schnitger. The probabilistic communication
complexity of set intersection. SIAM J. Discrete Math., 5:545-557, 1992.

A. R. Karlin, C. Mathieu, and C. T. Nguyen. Integrality gaps of linear 
and semi-definite programming relaxations for knapsack. In Proc. IPCO, 
pages 301-314, 2011. 

S. Khot. On the power of unique 2-prover 1-round games. In Proc. STOC, 
pages 767-775, 2002. 

S. Khot, G. Kindler, E. Mossel, and R. O'Donnell. Optimal inapproximabil- 
ity results for Max-Cut and other 2-variable CSPs? In Proc. FOCS, pages 
146-154, 2004. 

E. Kushilevitz and N. Nisan. Communication complexity. Cambridge Uni- 
versity Press, 1997. 

J. B. Lasserre. An explicit equivalent positive semidefinite program for non- 
linear 0-1 programs. SIAM J. Optim., 12:756-769, 2002. 

M. Laurent. A comparison of the Sherali-Adams, Lovasz-Schrijver, and 
Lasserre relaxations for 0-1 programming. Math. Oper. Res., pages 470- 
496, 2003. 

T. Lee and D. Theis. Lower bounds for sizes of semidefinite formulations
for some combinatorial optimization problems. arXiv:1203.3961, 2012.

L. Lovasz and A. Schrijver. Cones of matrices and set-functions and 0-1 
optimization. SIAM J. Optim., 1:166-190, 1991. 

E. Mossel, R. O'Donnell, and K. Oleszkiewicz. Noise stability of functions
with low influences: invariance and optimality. In Proc. FOCS, pages 21-30,
2005.

K. Pashkovich. Extended Formulations for Combinatorial Polytopes. PhD the- 
sis, Magdeburg Universitat, 2012. 

A. A. Razborov. On the distributional complexity of disjointness. Theoret.
Comput. Sci., 106(2):385-390, 1992.

G. Schoenebeck. Linear level Lasserre lower bounds for certain k-CSPs. In
Proc. FOCS, pages 593-602. IEEE, 2008.



G. Schoenebeck, L. Trevisan, and M. Tulsiani. A linear round lower bound 
for Lovasz-Schrijver SDP relaxations of vertex cover. In Proc. CCC, pages 
205-216. IEEE, 2007. 

A. Schrijver. Combinatorial optimization. Polyhedra and efficiency. Springer- 
Verlag, Berlin, 2003. 

H. D. Sherali and W. P. Adams. A hierarchy of relaxations between the 
continuous and convex hull representations for zero-one programming 
problems. SIAM J. Discrete Math., 3:411-430, 1990. 

M. Singh and K. Talwar. Improving integrality gaps via Chvatal-Gomory
rounding. In Approximation, randomization, and combinatorial optimization,
volume 6302 of Lecture Notes in Comput. Sci., pages 366-379. Springer,
2010.

V. V. Vazirani. Approximation algorithms. Springer-Verlag, Berlin, 2001. ISBN
3-540-65367-8.

M. Vyve and L. Wolsey. Approximate extended formulations. Math.
Program., 105(2):501-522, 2006.

D. P. Williamson and D. B. Shmoys. The design of approximation algorithms. 
Cambridge University Press, Cambridge, 2011. 

L. Wolsey. Heuristic analysis, linear programming and branch and bound. 
Math. Programming Stud., 13:121-134, 1980. 

M. Yannakakis. Expressing combinatorial optimization problems by linear 
programs (extended abstract). In Proc. STOC, pages 223-228, 1988. 

M. Yannakakis. Expressing combinatorial optimization problems by linear
programs. J. Comput. System Sci., 43(3):441-466, 1991.

G. M. Ziegler. Lectures on Polytopes, volume 152 of Graduate Texts in
Mathematics. Springer-Verlag, Berlin, 1995.



A Background on Polytopes and Polyhedra 

A (convex) polytope $P \subseteq \mathbb{R}^d$ is the convex hull $\operatorname{conv}(V)$ of a finite set $V$ of
points. Equivalently, $P$ is a polytope if and only if $P$ is bounded and it is the
set of solutions of a finite system of linear inequalities and possibly equalities.
(Note that every equality can be represented by a pair of inequalities.)

Let $P \subseteq \mathbb{R}^d$ be a polytope. A face of $P$ is a subset $F := \{x \in P \mid w^T x = \delta\}$
of $P$ such that $P$ satisfies the inequality $w^T x \le \delta$. A face is called proper if it
is not the polytope itself. A vertex is a minimal nonempty face, i.e., a point.
A facet is a maximal proper face, i.e., of dimension one less than $P$. The
inequality is called facet-defining if $F$ is a facet. The dimension of a polytope
$P$ is the dimension of its affine hull $\operatorname{aff}(P)$.

Every (finite or infinite) set $V$ such that $P = \operatorname{conv}(V)$ contains all the
vertices of $P$. Conversely, letting $\operatorname{vert}(P)$ denote the vertex set of $P$, we
have $P = \operatorname{conv}(\operatorname{vert}(P))$. Every (finite) system describing $P$ contains all
the facet-defining inequalities of $P$, up to scaling by positive numbers and
adding an equality satisfied by all points of $P$. Conversely, a linear description
of $P$ can be obtained by picking one defining inequality per facet and
adding a system of equalities describing $\operatorname{aff}(P)$. A 0/1-polytope in $\mathbb{R}^d$ is
simply the convex hull of a subset of $\{0,1\}^d$.

A (convex) polyhedron is a set $P \subseteq \mathbb{R}^d$ that is the intersection of a finite
collection of closed halfspaces. A polyhedron $P$ is a polytope if and only if
it is bounded.

For more background on polytopes and polyhedra, see the standard
reference Ziegler [1995].



B Proofs Missing From Section 2

B.1 Theorem 1

First, let $S^{P,Q} = TU$ be any rank-$r$ nonnegative factorization of $S^{P,Q}$ with
$r = \operatorname{rank}_+(S^{P,Q})$. Consider the system

$$Ax + Ty = b, \quad y \ge 0 \tag{2}$$

and the corresponding polyhedron $K := \{x \in \mathbb{R}^d \mid \exists y : Ax + Ty = b,\ y \ge 0\}$.
Because no column of $T$ is zero, the system (2) defines a polytope in
$\mathbb{R}^{d+r}$. Hence $K$, being the orthogonal projection of a polytope, is also a
polytope. We verify now that $P \subseteq K \subseteq Q$. The inclusion $K \subseteq Q$ simply
follows from $Ty \ge 0$. For the inclusion $P \subseteq K$, pick $v_j \in V$ and observe
that $(x, y) := (v_j, U^j)$ satisfies (2), where $U^j$ denotes the $j$th column of $U$,
because $Av_j + TU^j = Av_j + b - Av_j = b$ and $U^j \ge 0$. Thus (2) is a size-$r$ EF
of the pair $P$, $Q$. Therefore, $\operatorname{xc}(P, Q) \le \operatorname{rank}_+(S^{P,Q})$.
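
The lifting step of this first direction can be illustrated on a toy instance. In the sketch below the helper names, the nested-list matrix representation and the convention $S_{ij} = b_i - A_i v_j$ for the slack matrix are assumptions made for illustration; the toy pair is $P = \operatorname{conv}\{0, 1\} \subseteq \mathbb{R}$ and $Q = \{x \mid x \le 1,\ -x \le 0\}$, whose slack matrix is the $2 \times 2$ identity.

```python
def mat_vec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def slack_matrix(A, b, V):
    """S[i][j] = b_i - A_i v_j for inequalities A x <= b and points V."""
    return [[b[i] - sum(A[i][k] * v[k] for k in range(len(v))) for v in V]
            for i in range(len(A))]

def lifts_vertices(A, b, V, T, U):
    """Check that (v_j, U^j) satisfies A x + T y = b, y >= 0 for every point v_j."""
    for j, v in enumerate(V):
        y = [row[j] for row in U]  # j-th column of U
        lhs = [p + q for p, q in zip(mat_vec(A, v), mat_vec(T, y))]
        if lhs != list(b) or min(y) < 0:
            return False
    return True

# Toy instance: P = conv{0, 1}, Q = {x : x <= 1, -x <= 0}.
A = [[1], [-1]]
b = [1, 0]
V = [[0], [1]]
T = [[1, 0], [0, 1]]  # trivial factorization S = T U with T = U = I
U = [[1, 0], [0, 1]]
```

Here `lifts_vertices` only checks the inclusion $P \subseteq K$; the inclusion $K \subseteq Q$ follows from $Ty \ge 0$ as in the text.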

Second, suppose that the system

$$Ex + Fy = g, \quad y \ge 0 \tag{3}$$

defines a size-$r$ EF of the pair $P$, $Q$. Let $L \subseteq \mathbb{R}^{d+r}$ denote the polyhedron
defined by (3), and let $K \subseteq \mathbb{R}^d$ denote the orthogonal projection of $L$ into
$x$-space. Since $P \subseteq K$, for each point $v_j \in V$, there exists $w_j \in \mathbb{R}^r$ such
that $(v_j, w_j) \in L$. Since $K \subseteq Q$, each inequality $A_i x \le b_i$ is valid for $K$,
hence for $L$. Let $S^L$ denote the $(r + m) \times n$ nonnegative matrix that records
the slacks of the points $(v_j, w_j)$ with respect to $y_1 \ge 0, \dots, y_r \ge 0$ and
then with respect to $A_1 x \le b_1, \dots, A_m x \le b_m$. By construction, the submatrix
obtained from $S^L$ by deleting the $r$ first rows is exactly $S^{P,Q}$, thus
$\operatorname{rank}_+(S^{P,Q}) \le \operatorname{rank}_+(S^L)$. Furthermore, the first $r$ rows also form a slack
matrix of $L$, thus $\operatorname{rank}_+(S^L) \le r$. Therefore, $\operatorname{rank}_+(S^{P,Q}) \le r$. Taking
$r = \operatorname{xc}(P, Q)$ we find $\operatorname{xc}(P, Q) \ge \operatorname{rank}_+(S^{P,Q})$.



B.2 Proposition 3

Figure 1: CUT(3) and a dilate $\rho\,\mathrm{CUT}(3)$ for $\rho = 1.5$.



Let $Ex + Fy = g$, $y \ge 0$ denote a minimum size $\rho$-approximate EF
of $\mathrm{CUT}(n)$. Then $Ex + Fy = \lambda g$, $y \ge 0$, $\lambda \ge 0$, $x \le 1$ is an EF of the
multicut polytope $\mathrm{MULTICUT}(n)$. But then the size of this latter EF is at least
$2^{\Omega(\sqrt{n})}$, and so is the size of the former EF. This is due to the fact that, for
every graph $G$ with $n - 1$ vertices, there exists a face of $\mathrm{MULTICUT}(n)$ that
projects linearly to $\mathrm{STAB}(G)$. Thus by [Fiorini et al. 2012, Lemma 9] and
[Fiorini et al. 2012, Theorem 10] we have $\operatorname{xc}(\mathrm{MULTICUT}(n)) = 2^{\Omega(\sqrt{n})}$ and
so $\operatorname{xc}(\mathrm{CUT}(n), \rho\,\mathrm{CUT}(n)) = 2^{\Omega(\sqrt{n})}$. The result then follows from Theorem

m 

C Proofs Missing From Section 3

C.1 Lemma 4

Extending the previous notation $I_A$ and $I_B$, we will write $I_C$ for the indicator
of $C$ for every event $C$.

First we show that it suffices to consider the case $n = 4\ell - 1$. In general
$n = 4\ell - 1 + k$ for some $k \ge 0$. We reconstruct $\mu$ as follows. Let $H$ be
a uniformly chosen random subset of size $4\ell - 1$ of $[n]$. Let us choose $a$
and $b$ as subsets of $H$ with the distribution as in the Lemma. By reasons
of symmetry this gives the distribution $\mu$ of $a$, $b$. The claim of the Lemma
applied to the set $H$ gives

$$\mathbb{E}[X I_B \mid H] \ge (1-\varepsilon)\,\frac{p}{1-p}\,\mathbb{E}[X I_A \mid H] - r\,p\,\|X I_A \restriction H\|_\infty\, 2^{-\frac{\varepsilon^2}{4\ln 2}\ell + O(\log \ell)},$$

with $X \restriction H := X I_{a, b \subseteq H}$. Taking expected value of both sides establishes
the Lemma. Hence it is enough to prove the lemma for $n = 4\ell - 1$.

Second, we redefine the distribution $\mu$ in an alternative fashion. Let
$q = \sqrt{p}$, and let $T$ be a uniformly chosen partition of $[n]$ into 2 subsets $T_1$,
$T_2$ with $2\ell - 1$ elements and a singleton $\{i\}$. Given $T$ we choose $a$ and $b$
independently as subsets with $\ell$ elements. We flip a biased coin to decide
whether $i$ is an element of $a$. With probability $q$, we select $a$ as a uniform
random subset of $T_1 \cup \{i\}$ of size $\ell$ containing $i$. With probability $1 - q$,
we choose $a$ as a uniform random subset of $T_1$ of size $\ell$. We choose $b$ similarly
by using $T_2$ instead of $T_1$. By reasons of symmetry this distribution
is exactly $\mu$. Note that $a$ and $b$ intersect if and only if they both contain $i$,
therefore

$$\mathbb{P}(B) = q^2 = p, \qquad \mathbb{P}(A) = 1 - q^2 = 1 - p.$$
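
The partition-based description of $\mu$ translates directly into a sampler. The sketch below is illustrative only: the function name, the 0-based ground set and the use of Python's `random.Random` as the source of randomness are assumptions, and it covers the case $n = 4\ell - 1$.

```python
import random

def sample_pair(ell, q, rng):
    """Draw (a, b, i) following the partition-based description with n = 4*ell - 1."""
    n = 4 * ell - 1
    perm = list(range(n))
    rng.shuffle(perm)  # a uniform ordering yields a uniform partition (T1, T2, {i})
    T1 = perm[:2 * ell - 1]
    T2 = perm[2 * ell - 1:4 * ell - 2]
    i = perm[-1]

    def pick(block):
        # With probability q the set contains i plus ell - 1 elements of the block,
        # otherwise it is a uniform ell-element subset of the block.
        if rng.random() < q:
            return set(rng.sample(block, ell - 1)) | {i}
        return set(rng.sample(block, ell))

    return pick(T1), pick(T2), i
```

Since $T_1$ and $T_2$ are disjoint, the sampled sets can only meet in the singleton $i$, and they do so with probability $q^2$ because the two coins are independent.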

We are ready to prove equation (1). It is easy to reduce the statement to
the $r = 1$ case: we simply add up (1) for $i = 1, \dots, r$ with $f_i(a) g_i(b)$ instead
of $X$, and then use the estimate $\|f_i(a) g_i(b) \cdot I_A\|_\infty \le \|X \cdot I_A\|_\infty$. Hence we
shall restrict ourselves to the case $r = 1$, and suppress the subscript 1 for
readability, i.e., $f = f_1$, $g = g_1$ and $X = f(a) g(b)$.

Now $\mathbb{E}[f(a) g(b) \cdot I_A]$ and $\mathbb{E}[f(a) g(b) \cdot I_B]$ are conveniently expressed
in terms of the following values, which clearly do not depend on $p$:

$$\operatorname{Row}_0(T) := \mathbb{E}[f(a) \mid T, i \notin a], \qquad \operatorname{Row}_1(T) := \mathbb{E}[f(a) \mid T, i \in a],$$
$$\operatorname{Col}_0(T) := \mathbb{E}[g(b) \mid T, i \notin b], \qquad \operatorname{Col}_1(T) := \mathbb{E}[g(b) \mid T, i \in b].$$
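
First we note that $\operatorname{Col}_0(T)$ depends only on $T_2$, as the distribution of $b$
provided $T$ and $i \notin b$ is the uniform one on all the $\ell$-element subsets of $T_2$.
Similarly $\operatorname{Row}_0(T)$ depends only on $T_1$. We have

$$\mathbb{E}[\operatorname{Row}_0(T) \mid T_2] = \mathbb{E}\!\left[\frac{\mathbb{E}[f(a) I_{i \notin a} \mid T]}{\mathbb{P}[i \notin a \mid T]} \,\middle|\, T_2\right] = \frac{\mathbb{E}[f(a) I_{i \notin a} \mid T_2]}{1 - q} = \mathbb{E}[f(a) \mid T_2, i \notin a].$$

Moreover, the conditional distribution of $a$ is the same for the conditions $T$,
$T_2$, $(T_2, i \in a)$ and $(T_2, i \notin a)$: namely, the uniform distribution of $\ell$-element
subsets of $[n] \setminus T_2$. We conclude

$$\mathbb{E}[\operatorname{Row}_0(T) \mid T_2] = \mathbb{E}[f(a) \mid T_2, i \notin a] = \mathbb{E}[f(a) \mid T_2, i \in a] = \mathbb{E}[f(a) \mid T_2] = \mathbb{E}[f(a) \mid T]$$

and therefore

$$\mathbb{E}[\operatorname{Row}_0(T) \mid T_2] = \mathbb{E}[\operatorname{Row}_1(T) \mid T_2], \qquad \mathbb{E}[\operatorname{Col}_0(T) \mid T_1] = \mathbb{E}[\operatorname{Col}_1(T) \mid T_1], \tag{4}$$

especially

$$\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)] = \mathbb{E}[\operatorname{Row}_1(T)\operatorname{Col}_0(T)] = \mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_1(T)].$$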

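These lead to

$$\begin{aligned}
\mathbb{E}[f(a) g(b) I_A] &= (1-q)^2\,\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)] + q(1-q)\,\mathbb{E}[\operatorname{Row}_1(T)\operatorname{Col}_0(T)] + (1-q)q\,\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_1(T)] \\
&= (1-q^2)\,\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)], \\
\mathbb{E}[f(a) g(b) I_B] &= q^2\,\mathbb{E}[\operatorname{Row}_1(T)\operatorname{Col}_1(T)].
\end{aligned}$$

Hence the claimed (1) for $f(a) g(b)$ reduces to

$$\mathbb{E}[\operatorname{Row}_1(T)\operatorname{Col}_1(T)] \ge (1-\varepsilon)\,\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)] - \|f(a) g(b) \cdot I_A\|_\infty\, 2^{-\frac{\varepsilon^2}{4\ln 2}\ell + O(\log \ell)}. \tag{5}$$

Note that, at this point, $p$ is essentially eliminated from the claim. However,
for convenience we set $q = 1/2$ (i.e., $p = 1/4$), which gives a nice
interpretation of $\operatorname{Row}_0(T) + \operatorname{Row}_1(T)$ and $\operatorname{Col}_0(T) + \operatorname{Col}_1(T)$: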

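$$\mathbb{E}[f(a) \mid T] = \underbrace{\mathbb{E}[f(a) \mid T, i \in a]}_{\operatorname{Row}_1(T)} \cdot \underbrace{\mathbb{P}[i \in a \mid T]}_{1/2} + \underbrace{\mathbb{E}[f(a) \mid T, i \notin a]}_{\operatorname{Row}_0(T)} \cdot \underbrace{\mathbb{P}[i \notin a \mid T]}_{1/2} = \frac{\operatorname{Row}_0(T) + \operatorname{Row}_1(T)}{2}, \tag{6}$$

$$\mathbb{E}[g(b) \mid T] = \frac{\operatorname{Col}_0(T) + \operatorname{Col}_1(T)}{2}.$$

These values depend only on $T_2$ and $T_1$, respectively.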

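Let $\delta > 0$ be a constant chosen later; essentially it will be the coefficient
of $\ell$ in the exponent. Let $T_2$ be fixed. The main estimation is an upper
bound on $\mathbb{E}[(\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ \mid T_2]$, where $x^+ := \max\{x, 0\}$ denotes
the positive part of a number $x$. Let us assume

$$\mathbb{E}[f(a) \mid T] = \mathbb{E}[f(a) \mid T_2] > 2^{-\delta\ell - 1}\, \|f \restriction ([n] \setminus T_2)\|_\infty.$$

In particular $\mathbb{E}[f(a) \mid T]$ is positive. Note that

$$\mathbb{E}[f(a) \mid T_2] = \mathbb{E}[f(a) \mid T] = \frac{1}{\binom{2\ell}{\ell}} \sum_{\substack{x \subseteq [n] \setminus T_2 \\ |x| = \ell}} f(x),$$

hence we can define $s$ as a random $\ell$-element subset of $[n] \setminus T_2$ with distribution

$$\mathbb{P}[s = x \mid T_2] = \frac{f(x)}{\binom{2\ell}{\ell}\, \mathbb{E}[f(a) \mid T_2]} \le \frac{2^{\delta\ell + 1}}{\binom{2\ell}{\ell}}.$$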
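Let us introduce the shorthand notation $\lambda := \mathbb{P}[i \in s \mid T]$. Then

$$\operatorname{Row}_0(T) = \frac{\mathbb{E}[f(a) I_{i \notin a} \mid T]}{\mathbb{P}[i \notin a \mid T]} = 2\,\mathbb{E}[f(a) I_{i \notin a} \mid T] = 2\,\mathbb{E}[f(a) \mid T_2] \cdot \mathbb{P}[i \notin s \mid T] = 2(1 - \lambda)\,\mathbb{E}[f(a) \mid T_2], \tag{7}$$

$$\operatorname{Row}_1(T) = \frac{\mathbb{E}[f(a) I_{i \in a} \mid T]}{\mathbb{P}[i \in a \mid T]} = 2\,\mathbb{E}[f(a) I_{i \in a} \mid T] = 2\,\mathbb{E}[f(a) \mid T_2] \cdot \mathbb{P}[i \in s \mid T] = 2\lambda\,\mathbb{E}[f(a) \mid T_2]. \tag{8}$$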



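We now estimate the entropy of $s$. On the one hand,

$$H(s \mid T_2) \le \sum_{j \in [n] \setminus T_2} H(I_{j \in s} \mid T_2) = 2\ell\, \mathbb{E}[H(\lambda) \mid T_2].$$

On the right-hand side, we apply the binary entropy function to $\lambda$ and take
the expected value.

On the other hand, we clearly have

$$H(s \mid T_2) = \sum_x \mathbb{P}[s = x \mid T_2] \log \frac{1}{\mathbb{P}[s = x \mid T_2]} \ge \sum_x \mathbb{P}[s = x \mid T_2] \log \frac{\binom{2\ell}{\ell}}{2^{\delta\ell + 1}} = \log \frac{\binom{2\ell}{\ell}}{2^{\delta\ell + 1}}.$$

This implies

$$1 - \mathbb{E}[H(\lambda) \mid T_2] \le \frac{\delta}{2} + O\!\left(\frac{\log \ell}{\ell}\right). \tag{9}$$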



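To estimate this expression, we use the Taylor expansion of the binary
entropy function at 1/2:

$$H(x) = 1 - \frac{(1-2x)^2}{2\ln 2} - \frac{\xi^3 + (1-\xi)^3}{12\,(\xi(1-\xi))^3 \ln 2}\left(x - \frac{1}{2}\right)^4$$

for a suitable $\xi$ between $x$ and 1/2. The last subtracted term is nonnegative, hence

$$H(x) \le 1 - \frac{(1-2x)^2}{2\ln 2}.$$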
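
The quadratic upper bound on the binary entropy can be sanity-checked numerically. The following sketch evaluates both sides on a uniform grid of $[0, 1]$; the function names and the grid resolution are illustrative choices, and a grid check is of course no substitute for the Taylor argument.

```python
import math

def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def quadratic_upper_bound(x):
    """The bound 1 - (1 - 2x)^2 / (2 ln 2)."""
    return 1 - (1 - 2 * x) ** 2 / (2 * math.log(2))

def max_violation(steps=1000):
    """Largest value of H(x) - bound(x) on a uniform grid of [0, 1]."""
    return max(binary_entropy(k / steps) - quadratic_upper_bound(k / steps)
               for k in range(steps + 1))
```

The two sides agree at $x = 1/2$, where both equal 1, and the gap grows towards the endpoints of the interval.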

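Hence we can refine (9) to

$$\frac{\delta}{2} + O\!\left(\frac{\log \ell}{\ell}\right) \ge \frac{\mathbb{E}[(1-2\lambda)^2 \mid T_2]}{2\ln 2} \ge \frac{(\mathbb{E}[|1-2\lambda| \mid T_2])^2}{2\ln 2}. \tag{10}$$

From (7) and (8) we derive

$$\mathbb{E}[|\operatorname{Row}_0(T) - \operatorname{Row}_1(T)| \mid T_2] = 2\,\mathbb{E}[|1 - 2\lambda| \mid T_2] \cdot \mathbb{E}[\operatorname{Row}_0(T) \mid T_2].$$

Finally, combining this with (10) leads to

$$\mathbb{E}[|\operatorname{Row}_0(T) - \operatorname{Row}_1(T)| \mid T_2] \le 2\sqrt{2\delta'}\, \mathbb{E}[\operatorname{Row}_0(T) \mid T_2] \tag{11}$$

with

$$\delta' := \left(\frac{\delta}{2} + O\!\left(\frac{\log \ell}{\ell}\right)\right) \ln 2. \tag{12}$$

Recall from (4) that $\mathbb{E}[\operatorname{Row}_0(T) \mid T_2] = \mathbb{E}[\operatorname{Row}_1(T) \mid T_2]$ and hence

$$\begin{aligned}
2\,\mathbb{E}[(\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ \mid T_2] &= \mathbb{E}[\operatorname{Row}_0(T) - \operatorname{Row}_1(T) \mid T_2] + \mathbb{E}[|\operatorname{Row}_0(T) - \operatorname{Row}_1(T)| \mid T_2] \\
&= \mathbb{E}[|\operatorname{Row}_0(T) - \operatorname{Row}_1(T)| \mid T_2].
\end{aligned}$$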

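Therefore (11) can be rewritten to

$$\mathbb{E}[(\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ \mid T_2] \le \sqrt{2\delta'}\, \mathbb{E}[\operatorname{Row}_0(T) \mid T_2],$$

when $\mathbb{E}[f(a) \mid T] > 2^{-\delta\ell - 1}\, \|f \restriction ([n] \setminus T_2)\|_\infty$.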

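We formulate the global version by multiplying with $\operatorname{Col}_0(T)$, which
depends only on $T_2$, and taking expected value. Let row-big($T$) denote the
event $\mathbb{E}[f(a) \mid T] > 2^{-\delta\ell - 1}\, \|f \restriction ([n] \setminus T_2)\|_\infty$. Therefore we have obtained

$$\mathbb{E}[(\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ \operatorname{Col}_0(T)\, I_{\text{row-big}(T)}] \le \sqrt{2\delta'}\, \mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)]. \tag{13}$$

Similarly let column-big($T$) denote $\mathbb{E}[g(b) \mid T] > 2^{-\delta\ell - 1}\, \|g \restriction ([n] \setminus T_1)\|_\infty$.
Then

$$\mathbb{E}[\operatorname{Row}_0(T)(\operatorname{Col}_0(T) - \operatorname{Col}_1(T))^+ I_{\text{column-big}(T)}] \le \sqrt{2\delta'}\, \mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)]. \tag{14}$$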



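Let small($T$) be the complement of the intersection of the events row-big($T$)
and column-big($T$), i.e., either $\mathbb{E}[f(a) \mid T] \le 2^{-\delta\ell - 1}\, \|f \restriction ([n] \setminus T_2)\|_\infty$ or
$\mathbb{E}[g(b) \mid T] \le 2^{-\delta\ell - 1}\, \|g \restriction ([n] \setminus T_1)\|_\infty$ occurs. Obviously the first inequality
together with $\operatorname{Row}_0(T) \le \operatorname{Row}_0(T) + \operatorname{Row}_1(T) = 2\,\mathbb{E}[f(a) \mid T]$ by (6)
implies

$$\operatorname{Row}_0(T)\operatorname{Col}_0(T) \le 2\,\mathbb{E}[f(a) \mid T] \cdot \mathbb{E}[g(b) \mid T, i \notin b] \le 2^{-\delta\ell}\, \|f \restriction ([n] \setminus T_2)\|_\infty\, \|g \restriction T_2\|_\infty \le 2^{-\delta\ell}\, \|f(a) g(b) \cdot I_A\|_\infty,$$

as $f$ and $g$ are only considered on disjoint sets here. An analogous result
holds when $\mathbb{E}[g(b) \mid T] \le 2^{-\delta\ell - 1} \cdot \|g \restriction ([n] \setminus T_1)\|_\infty$. Thus

$$\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)\, I_{\text{small}(T)}] \le \|f(a) g(b) \cdot I_A\|_\infty\, 2^{-\delta\ell}. \tag{15}$$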

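At last, we are ready to derive (5). We need the obvious bound

$$\begin{aligned}
\operatorname{Row}_0(T)\operatorname{Col}_0(T) - \operatorname{Row}_1(T)\operatorname{Col}_1(T)
&\le \operatorname{Row}_0(T)\operatorname{Col}_0(T) - \bigl(\operatorname{Row}_0(T) - (\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+\bigr)\bigl(\operatorname{Col}_0(T) - (\operatorname{Col}_0(T) - \operatorname{Col}_1(T))^+\bigr) \\
&\qquad + (\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ (\operatorname{Col}_0(T) - \operatorname{Col}_1(T))^+ \\
&= \operatorname{Row}_0(T)(\operatorname{Col}_0(T) - \operatorname{Col}_1(T))^+ + (\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ \operatorname{Col}_0(T).
\end{aligned} \tag{16}$$

Combining (16), (13), (14) and (15) we get

$$\begin{aligned}
\mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T) - \operatorname{Row}_1(T)\operatorname{Col}_1(T)]
&\le \mathbb{E}[(\operatorname{Row}_0(T) - \operatorname{Row}_1(T))^+ \operatorname{Col}_0(T)\, I_{\text{row-big}(T)}] \\
&\quad + \mathbb{E}[\operatorname{Row}_0(T)(\operatorname{Col}_0(T) - \operatorname{Col}_1(T))^+ I_{\text{column-big}(T)}] \\
&\quad + \mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)\, I_{\text{small}(T)}] \\
&\le 2\sqrt{2\delta'}\, \mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)] + \|f(a) g(b) \cdot I_A\|_\infty\, 2^{-\delta\ell}.
\end{aligned}$$

We conclude

$$\mathbb{E}[\operatorname{Row}_1(T)\operatorname{Col}_1(T)] \ge (1 - 2\sqrt{2\delta'})\, \mathbb{E}[\operatorname{Row}_0(T)\operatorname{Col}_0(T)] - \|f(a) g(b) \cdot I_A\|_\infty\, 2^{-\delta\ell}. \tag{17}$$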

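We now choose the constant $\delta$ to reduce this to (5). Therefore we require
$\varepsilon = 2\sqrt{2\delta'}$, from which we express $\delta$ in terms of $\varepsilon$ using (12):

$$\delta = \frac{2\delta'}{\ln 2} - O\!\left(\frac{\log \ell}{\ell}\right) = \frac{\varepsilon^2}{4\ln 2} - O\!\left(\frac{\log \ell}{\ell}\right).$$

This and $\varepsilon = 2\sqrt{2\delta'}$ reduce (17) to (5).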
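
C.2 Corollary 5

Let $\mu$ be the distribution of inputs from above with $p = 1/2$. Let
$X = \sum_{i \in [r]} X_i$ where the $X_i$ are rectangles for which Alice and Bob infer
disjointness; let $D$ denote this event. Now Lemma 4 implies (deliberately using the
error $\varepsilon$ from the statement in the lemma)

$$\begin{aligned}
\mathbb{P}[D \wedge (a \cap b \ne \emptyset)] &\ge (1 - \varepsilon)\, \mathbb{P}[D \wedge (a \cap b = \emptyset)] - \frac{r}{2} \cdot 2^{-\frac{\varepsilon^2}{16\ln 2}n + O(\log n)} \\
&= (1 - \varepsilon)\bigl(\underbrace{\mathbb{P}[a \cap b = \emptyset]}_{1/2} - \mathbb{P}[(\neg D) \wedge (a \cap b = \emptyset)]\bigr) - \frac{r}{2} \cdot 2^{-\frac{\varepsilon^2}{16\ln 2}n + O(\log n)}.
\end{aligned}$$

We use the requirement of small error and derive

$$\frac{1}{2} - \varepsilon \ge \mathbb{P}[D \wedge a \cap b \ne \emptyset] + (1 - \varepsilon)\, \mathbb{P}[\neg D \wedge a \cap b = \emptyset] \ge \frac{1 - \varepsilon}{2} - \frac{r}{2} \cdot 2^{-\frac{\varepsilon^2}{16\ln 2}n + O(\log n)}.$$

We conclude

$$r \ge \varepsilon\, 2^{\frac{\varepsilon^2}{16\ln 2}n - O(\log n)} = 2^{\varepsilon^2\, \Omega(n)},$$

and with Yao's Min-Max Principle the result follows.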



D Proofs Missing From Section 4

D.1 Theorem 8

Proof of Theorem 8. The $n$-approximate EF of CLIQUE is trivial: it is defined
by the system $0 \le x \le 1$, or in slack form $x - y = 0$, $x + z = 1$, $y \ge 0$,
$z \ge 0$. We claim that this defines an $n$-approximate EF of CLIQUE of size $2n^2$.
Indeed, letting $K = [0,1]^{n \times n}$ denote the polytope defined by this EF, we
have $P \subseteq K$. Moreover, $\max\{\langle w, x\rangle \mid x \in K\} \le n \le n \cdot \max\{\langle w, x\rangle \mid x \in P\}$
for all admissible objective functions $w$ of dimension $n \times n$ with a nonzero
diagonal. In case an admissible $w$ has $w_{ii} = 0$ for all $i \in [n]$, we have
$\max\{\langle w, x\rangle \mid x \in K\} = 0 = \max\{\langle w, x\rangle \mid x \in P\}$. Our claim and the first
part of the theorem follow.

The second part of the theorem follows directly from Theorem 7 and
the fact that $Q^{\mathrm{all}} \subseteq Q$. □

D.2 Theorem 9

Before giving the proof of the theorem, we state and prove a lemma. Let
$P = \mathrm{COR}(n)$ be the correlation polytope and $Q = Q(n) \subseteq \mathbb{R}^{n \times n}$ be the
polyhedron defined above in Section 4.1. Although every polytope $K$ sandwiched
between $P$ and $Q$ has super-polynomial extension complexity (by
Theorem 7, this even applies to polytopes sandwiched between $P$ and $\rho Q$
for $\rho = O(n^{1/2-\varepsilon})$), there exists a spectrahedron $S$ sandwiched between $P$
and $Q$ with small semidefinite extension complexity.

Lemma 10 (Existence of spectrahedron). Let $n$ be a positive integer and let
$P = \mathrm{COR}(n)$, $Q = Q(n)$ be as above. Then there exists a spectrahedron $S$ in
$\mathbb{R}^{n \times n}$ with $P \subseteq S \subseteq Q$ and $\operatorname{xc}_{\mathrm{SDP}}(S) \le n + 1$.



Proof. As shown in Fiorini et al. [2012], there exist PSD matrices $T^a, U_b \in \mathbb{S}_+^{n+1}$
for $a, b \in \{0,1\}^n$ such that $\langle T^a, U_b\rangle = (1 - a^T b)^2$ for all $a, b \in \{0,1\}^n$.
Let $M = M(n) \in \mathbb{R}^{2^n \times 2^n}$ be the matrix defined as $M_{ab} = (1 - a^T b)^2$. This is
a low-rank nonnegative matrix extending the UDISJ matrix, which is also
the slack matrix of the pair $P$, $Q$. Then $M = TU$ is a rank-$(n + 1)$
PSD-factorization of $M$.
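
One explicit choice of such matrices, given here only as an illustration, uses rank-one factors: $T^a := uu^T$ with $u = (1, -a)$ and $U_b := vv^T$ with $v = (1, b)$, so that $\langle T^a, U_b\rangle = (u^T v)^2 = (1 - a^T b)^2$. The sketch below verifies this identity by enumeration; the function names are assumptions, and the construction is not claimed to be verbatim the one used in Fiorini et al. [2012].

```python
from itertools import product

def outer(u):
    """Rank-one PSD matrix u u^T as nested lists."""
    return [[ui * uj for uj in u] for ui in u]

def frobenius(X, Y):
    """Frobenius inner product of two matrices of equal shape."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def T(a):
    """T^a = (1, -a)(1, -a)^T, an (n+1) x (n+1) PSD matrix."""
    return outer((1,) + tuple(-x for x in a))

def U(b):
    """U_b = (1, b)(1, b)^T, an (n+1) x (n+1) PSD matrix."""
    return outer((1,) + tuple(b))

def check_factorization(n):
    """Verify <T^a, U_b> == (1 - a^T b)^2 for all a, b in {0,1}^n."""
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            t = sum(x * y for x, y in zip(a, b))
            if frobenius(T(a), U(b)) != (1 - t) ** 2:
                return False
    return True
```

Both factors are of the form $uu^T$ and hence positive semidefinite, and they have size $n + 1$, matching the rank claimed in the proof.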

For convenience write $Q = \{x \in \mathbb{R}^{n \times n} \mid Ax \le 1\}$ with $Ax \le 1$ being the
defining system from Section 4.1. Now consider the system $Ax + Ty = 1$,
$y \in \mathbb{S}_+^{n+1}$ and $S := \{x \in \mathbb{R}^{n \times n} \mid \exists y : Ax + Ty = 1,\ y \in \mathbb{S}_+^{n+1}\}$. First observe
that $S \subseteq Q$: since $T^a \in \mathbb{S}_+^{n+1}$ for all $a \in \{0,1\}^n$ and $y \in \mathbb{S}_+^{n+1}$ we have $Ty \ge 0$
and thus $Ax \le 1$ holds for all $x \in S$.

In order to show that $P \subseteq S$ recall that $M$ is the slack matrix of the pair
$P$, $Q$. Therefore, for each vertex $x := bb^T$ of $P$, we can pick $y := U_b$ from
the factorization such that $Ax + Ty = Ax + 1 - Ax = 1$ and $y \in \mathbb{S}_+^{n+1}$. It
follows that $P \subseteq S$. □

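Proof of Theorem 9. We define $S$ as in Lemma 10. Hence, we have $P \subseteq S \subseteq Q$
and $\operatorname{xc}_{\mathrm{SDP}}(S) \le n + 1$. As $0 \in S$ this implies in particular $P \subseteq S \subseteq \rho S \subseteq \rho Q$
for $\rho \ge 1$. If now $K$ is a polyhedron such that $S \subseteq K \subseteq \rho S$ then it follows
$P \subseteq K \subseteq \rho Q$. The result follows from Theorem 7. □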